PROPIONIBACTERIUM VECTOR

This invention relates to an endogenous plasmid of Propionibacterium, vectors derived
from it and the use of these vectors to express (heterologous) proteins in bacteria,
especially Propionibacteria. In particular, transformed bacteria can be used either to
produce vitamin B12 by fermentation or in cheese making.

Introduction 

Propionibacteria are Gram-positive bacteria capable of producing various useful 
compounds in a variety of industrial processes. For example, several Propionibacterium 
species are known to produce vitamin B12 (cobalamin) in large scale fermentation
processes. Other species are used in dairy applications such as cheese manufacturing,
where they contribute to, and in many cases are mainly responsible for, the specific
flavour and texture of the cheese. Many Propionibacterium species are considered safe for
inclusion, as live organisms, into food and animal feed. 

To be able to fully exploit the biotechnological potential of Propionibacterium,
efficient and flexible genetic engineering techniques are required. Such techniques rely on
the availability of a suitable plasmid to express a protein from a heterologous gene in
Propionibacterium.

EP-A-0400931 (Nippon Oil) refers to an endogenous plasmid (pTY-1) from
Propionibacterium pentosaceum (ATCC 4875) but does not describe its sequence or 
exemplify how it may be used to express a heterologous gene. 

JP 08-56673 refers to the plasmid pTY-1 for producing vitamin B12 but does not
provide any evidence that the plasmid remains as a freely replicating extrachromosomal
element or that the plasmid is stable inside the transformed cells.

The invention therefore seeks to provide vectors that are more efficient than those in
the prior art, and can remain extrachromosomal and/or are stable. In particular the
invention aims to provide an efficient vector for the cloning or expression of
Propionibacterium or foreign genomic fragments or genes into a (Propionibacterium) host
strain. This may enable host specific restriction enzymes to be circumvented and thereby
avoid the host treating the plasmid as a foreign polynucleotide.

Summary of the invention

Accordingly, the present invention in a first aspect provides a polynucleotide
comprising a sequence capable of hybridising selectively to

(a) SEQ ID NO: 1 or the complement thereof;

(b) a sequence from the 3.6 kb plasmid of Propionibacterium freudenreichii CBS
101022;

(c) a sequence from the 3.6 kb plasmid of Propionibacterium freudenreichii CBS
101023; or

(d) a sequence that encodes a polypeptide of the invention, such as (at least part of)
the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 3, or the complement
thereof.

SEQ ID NO: 1 sets out the DNA sequence of the endogenous plasmid of
Propionibacterium LMG 16545 which the inventors have discovered. The first coding
sequence runs from nucleotide 273 to nucleotide 1184 and the predicted amino acid
sequence of this coding sequence is shown in SEQ ID NO: 2. The second coding sequence
runs from nucleotide 1181 to nucleotide 1483 and the predicted amino acid sequence of
this coding sequence is shown in SEQ ID NO: 3.
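
As a quick consistency check on the coordinates quoted above, the following Python sketch computes the length of each coding sequence and the overlap between the two. The coordinates are those given in the text, read as 1-based and inclusive; that reading, the function names, and the question of whether each range includes its stop codon are assumptions made for illustration, not statements from the specification.

```python
def cds_length(start, end):
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two 1-based, inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


# Coordinates as quoted in the text for SEQ ID NO: 1.
ORF1 = (273, 1184)
ORF2 = (1181, 1483)

for name, (start, end) in (("ORF1", ORF1), ("ORF2", ORF2)):
    n = cds_length(start, end)
    print(f"{name}: {n} nt, {n // 3} codons, remainder {n % 3}")

print("overlap between ORF1 and ORF2:", overlap(ORF1, ORF2), "nt")
```

Under these assumptions both ranges are whole multiples of three nucleotides, and the two coding sequences share a short stretch at the junction.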

The inventors screened a large collection of Propionibacterium isolates and identified
two strains both harboring cryptic plasmids with a size of 3.6 kb. One of the strains is
Propionibacterium freudenreichii LMG 16545 which was deposited at Centraalbureau
voor Schimmelcultures (CBS), Oosterstraat 1, Postbus 273, NL-3740 AG Baarn,
Netherlands, in the name of Gist-brocades B.V. of Wateringseweg 1, P.O. Box 1, 2600 MA
Delft, The Netherlands, on 19 June 1998 under the terms of the Budapest Treaty and was
given accession number CBS 101022. The other strain is Propionibacterium
freudenreichii LMG 16546 which was also deposited by the same depositor on 19 June
1998 under the terms of the Budapest Treaty at CBS and was given accession number CBS
101023.

Through full characterization and computer assisted analysis of the nucleotide
sequence of LMG 16545 the inventors have been able to identify insertion sites for foreign
DNA fragments. These sites have allowed construction of plasmids that are still capable of
autonomous replication in Propionibacterium.

Surprisingly, it was found that an erythromycin resistance gene from the actinomycete
Saccharopolyspora erythraea is efficiently expressed in Propionibacterium and thus can be 
used as a selection marker for transformed cells. 

Also constructed are bifunctional vectors, stably maintainable and selectable in both
E. coli and Propionibacterium. This allows the use of E. coli for vector construction, as well
as functional expression of homologous or heterologous genes in Propionibacterium.
Vector construction using E. coli is comparatively easy and can be done quickly.

The polynucleotide of the invention may be autonomously replicating or
extrachromosomal, for example in a bacterium such as a Propionibacterium.

Hence in another aspect the invention provides a vector which comprises a
polynucleotide of the invention.

The invention also provides a process for the preparation of a polypeptide, the
process comprising cultivating a host cell transformed or transfected with a vector of
the invention under conditions to provide for expression of the polypeptide.

The invention additionally provides a polypeptide which comprises the sequence
set out in SEQ ID NO: 2 or 3 or a sequence substantially homologous to that
sequence, or a fragment of either sequence, or a protein encoded by a polynucleotide
of the invention.

Detailed Description of the Invention 

Polynucleotides

A polynucleotide of the invention may be capable of hybridising selectively
with the sequence of SEQ ID NO: 1, or a portion of SEQ ID NO: 1, or to the
sequence complementary to that sequence or portion of the sequence. The
polynucleotide of the invention may be capable of hybridising selectively to the
sequence of the 3.6 kb plasmid of P. freudenreichii CBS 101022 or CBS 101023, or
to a portion of the sequence of either plasmid. Typically, a polynucleotide of the
invention is a contiguous sequence of nucleotides which is capable of selectively
hybridising to the sequence of SEQ ID NO: 1 or of either 3.6 kb plasmid, or portion
of any of these sequences, or to the complement of these sequences or portion of any
of these sequences.

A polynucleotide of the invention and the sequence of SEQ ID NO: 1, or either
of the 3.6 kb plasmids, or a sequence encoding a polypeptide, or a portion of these
sequences, can hybridise at a level significantly above background. Background
hybridisation may occur, for example, because of other polynucleotides present in
the preparation. The signal level generated by the interaction between a
polynucleotide of the invention and the sequence of SEQ ID NO: 1 or of either 3.6
kb plasmid, or portion of these sequences, is typically at least 10 fold, preferably at
least 100 fold, as intense as interactions between other polynucleotides and the
coding sequence of SEQ ID NO: 1 or of either 3.6 kb plasmid, or a sequence
encoding the polypeptide, or portion of these sequences. The intensity of interaction
may be measured, for example, by radiolabelling the probe, e.g. with 32P. Selective
hybridisation is typically achieved using conditions of medium stringency (for
example 0.3M sodium chloride and 0.03M sodium citrate at about 50°C) to high
stringency (same conditions but at about 60°C).
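
The fold-over-background criterion described above can be expressed as a simple ratio test. The sketch below is illustrative only: the function names and the example counts are hypothetical, and it assumes that signal and background are measured in the same units (for instance, counts from a radiolabelled probe).

```python
def fold_over_background(signal, background):
    """Ratio of the probe signal to the background signal."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background


def is_selective(signal, background, min_fold=10.0):
    """True if the signal is at least `min_fold` times the background.

    The default of 10 follows the 'at least 10 fold' figure in the text;
    pass min_fold=100.0 for the preferred threshold.
    """
    return fold_over_background(signal, background) >= min_fold


# Hypothetical counts, for illustration only.
print(is_selective(signal=5200.0, background=40.0))
print(is_selective(signal=900.0, background=40.0, min_fold=100.0))
```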

Polynucleotides included in the invention can be generally at least 70%, 
preferably at least 80 or 90%, more preferably at least 95%, and optimally at least 
98% homologous (to the sequences (a) to (d)) over a region of at least 20, preferably 
at least 30, for instance at least 40, 60 or 100 or more contiguous nucleotides. 

Any combination of the above mentioned degrees of homology and minimum 
sizes may be used to define polynucleotides of the invention, with the more stringent
combinations (i.e. higher homology over longer lengths) being preferred. Thus for 
example a polynucleotide which is at least 80% or 90% homologous over 25, 
preferably over 30 nucleotides forms one embodiment of the invention, as does a 
polynucleotide which is at least 90% or 95% homologous over 40 nucleotides. 
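
To make the combination of a homology percentage and a minimum length concrete, the following Python sketch tests whether a query shares at least a given percentage of identical positions with a target over some window of contiguous nucleotides. It is a deliberately simplified illustration: it treats homology as ungapped identity and uses a brute-force window scan, both of which are assumptions of this sketch rather than the method of the specification, and the sequences are made up.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(query, target, min_identity=70.0, min_length=20):
    """True if any ungapped window of `min_length` contiguous nucleotides
    in `query` is at least `min_identity` percent identical to some
    window of the same length in `target`."""
    for i in range(len(query) - min_length + 1):
        window = query[i:i + min_length]
        for j in range(len(target) - min_length + 1):
            if percent_identity(window, target[j:j + min_length]) >= min_identity:
                return True
    return False


# Made-up example: a 20-nucleotide query differing from the target at 2 positions.
target = "ACGT" * 10
query = list(target[:20])
query[3] = "A"
query[10] = "T"
query = "".join(query)

print(percent_identity(query, target[:20]))
print(meets_threshold(query, target, min_identity=90.0, min_length=20))
```

For real sequences an alignment program would normally be used instead of this exhaustive scan.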

The portions referred to above may be the coding sequences of SEQ ID NO: 1 or
of either 3.6 kb plasmid. Other preferred portions of SEQ ID NO: 1 are the
replication origin, promoter or regulatory sequences, or sequences capable of
effecting or assisting autonomous replication in a host cell, such as a
Propionibacterium.

It has been found that the portion of the plasmid from the restriction site SalI to
AlwNI appears to be the region that is required for replication of the plasmid. Other
parts of the plasmid have been deleted and yet replication does not appear to have
been adversely affected. Therefore in the invention sequence (b) and (c) can be the
region delineated by the restriction sites SalI and AlwNI. This is approximately 1.7 kb
in length. Alternatively, sequences (b) or (c) can be replaced by the sequence
corresponding to nucleotides 1 to 1,800, such as 100 to 1,700, suitably 150 to 1,500,
advantageously 200 to 1,300 and optimally 250 to 1,200 of SEQ ID NO: 1.

The proteins (SEQ ID NOs: 2 and 3) encoded by ORF1 and ORF2 respectively
are both thought to help the plasmid replicate. The plasmid replicates by the known
rolling circle replication method, where the original ds DNA plasmid is cut by either
of the proteins, which assists production of a copy of the outer strand using the inner
strand as a template. The copy of the outer ring is removed and the ends joined. The
host then replicates a new inner ring using the generated outer ring as a template.

The coding sequences of the invention may be modified by nucleotide
substitutions, for example from 1, 2 or 3 to 10, 25, 50 or 100 substitutions. The
polynucleotides of sequences (a) to (d) may alternatively or additionally be modified
by one or more insertions or deletions and/or by an extension at either or both ends.
Degenerate substitutions may be made and/or substitutions may be made which
would result in a conservative amino acid substitution when the modified sequence is
translated, for example as discussed later with relation to the polypeptides of the
invention.

Polynucleotides of the invention may comprise DNA or RNA. They may also be
polynucleotides which include within them synthetic or modified nucleotides. A
number of different types of modification to polynucleotides are known in the art.
These include methylphosphonate and phosphorothioate backbones, addition of
acridine or polylysine chains at the 3' and/or 5' ends of the molecule. For the
purposes of the present invention, it is to be understood that the polynucleotides
described herein may be modified by any method available in the art. Such
modifications may be carried out in order to enhance in vivo activity or lifespan.

Polynucleotides of the invention may be used as a primer, e.g. a PCR
(polymerase chain reaction) primer, a primer for an alternative amplification
reaction, a probe e.g. labelled with a revealing label by conventional means using
radioactive or non-radioactive labels, or the polynucleotides may be incorporated or
cloned into vectors.

Such primers, probes and other fragments will be at least 15, preferably at least
20, for example at least 25, 30 or 40 nucleotides in length. They will typically be up
to 40, 50, 60, 70, 100 or 150 nucleotides in length. Probes and fragments can be
longer than 150 nucleotides, for example up to 200, 300, 500, 1,000 or 1,500
nucleotides in length, or even up to a few nucleotides, such as 5 or 10 nucleotides,
short of any of the sequences (a) to (d).

WO 99/67356                                                      PCT/EP99/04416

Polynucleotides such as a DNA polynucleotide and primers according to the 
invention may be produced recombinantly, synthetically, or by any means available 
to those of skill in the art. They may also be cloned by standard techniques. The 
polynucleotides are typically provided in isolated and/or purified form. 

In general, primers will be produced by synthetic means, involving a stepwise
manufacture of the desired nucleic acid sequence one nucleotide at a time.
Techniques for accomplishing this using automated techniques are readily available
in the art.

Longer polynucleotides will generally be produced using recombinant means,
for example using PCR cloning techniques. This will involve making a pair of
primers (e.g. of about 15-30 nucleotides) to the region of SEQ ID NO: 1 or of either
3.6 kb plasmid which it is desired to clone, bringing the primers into contact with
DNA obtained from a Propionibacterium, performing a polymerase chain reaction
under conditions which bring about amplification of the desired region, isolating the
amplified fragment (e.g. by purifying the reaction mixture on an agarose gel) and
recovering the amplified DNA. The primers may be designed to contain suitable
restriction enzyme recognition sites so that the amplified DNA can be cloned into a
suitable cloning vector. Such techniques may be used to obtain all or part of SEQ
ID NO: 1 or either 3.6 kb plasmid.

The techniques mentioned herein are well known in the art.
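
The primer design step described above, in which each primer carries a restriction enzyme recognition site, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the template is a made-up sequence rather than SEQ ID NO: 1, the function names are hypothetical, and the choice of EcoRI (GAATTC) and BamHI (GGATCC) sites is arbitrary. Practical primer design would also consider melting temperature, extra 5' bases for efficient digestion, and whether the chosen sites occur inside the amplified region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primer_pair(template, start, end, primer_len=20,
                       fwd_site="GAATTC", rev_site="GGATCC"):
    """Return (forward, reverse) primers flanking template[start..end].

    Coordinates are 1-based and inclusive. Each primer is written 5' to 3'
    and carries a restriction site at its 5' end.
    """
    if not 1 <= start <= end <= len(template):
        raise ValueError("region lies outside the template")
    region = template[start - 1:end].upper()
    if len(region) < primer_len:
        raise ValueError("region is shorter than the primer length")
    forward = fwd_site + region[:primer_len]
    reverse = rev_site + reverse_complement(region[-primer_len:])
    return forward, reverse


# Made-up template, for illustration only.
template = "ATGGCTAGCTTGACCGATCGTACGGATCAGTTCGACCTGAAGCTTGCATGA"
fwd, rev = design_primer_pair(template, 1, len(template), primer_len=15)
print(fwd)
print(rev)
```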

Polynucleotides which are not 100% homologous to SEQ ID No: 1 or either 3.6 
kb plasmid but fall within the scope of the invention can be obtained in a number of 
ways. 

Homologous polynucleotides of SEQ ID NO: 1 or of either 3.6 kb plasmid may
be obtained for example by probing genomic DNA libraries made from a range of
Propionibacteria, such as P. freudenreichii, P. jensenii, P. thoenii, P. acidipropionici,
or other strains of bacteria of the class Actinomycetes, or other gram positive bacteria,
or those that are G:C rich. All these organisms are suitable sources of homologous
or heterologous genes, promoters, enhancers, or host cells, for use in the invention.

Such homologues and fragments thereof in general will be capable of
selectively hybridising to the coding sequence of SEQ ID NO: 1 or its complement
or of either 3.6 kb plasmid. Such sequences may be obtained by probing genomic
DNA libraries of the Propionibacterium with probes comprising all or part of the
coding sequence of SEQ ID NO: 1 or of either 3.6 kb plasmid under conditions of
medium to high stringency (for example 0.3M sodium chloride and 0.03M sodium
citrate at from about 50°C to about 60°C).

Homologues may also be obtained using degenerate PCR which will use primers
designed to target conserved sequences within the homologues. Conserved
sequences can be predicted from aligning SEQ ID NO: 1 or the sequence of either 3.6
kb plasmid with their homologues. The primers will contain one or more degenerate
positions and will be used at stringency conditions lower than those used for cloning
sequences with single sequence primers against known sequences.
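
The effect of degenerate positions on a primer can be illustrated by counting how many distinct sequences a degenerate primer represents. The sketch below uses the standard IUPAC nucleotide ambiguity codes; the example primer is made up and the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Made-up degenerate primer, for illustration only.
primer = "ACGRYN"
print(degeneracy(primer))
print(expand("AR"))
```

A higher degeneracy means each individual sequence is present at a lower effective concentration, which is one reason such primers are typically used under less stringent conditions.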

Alternatively, such polynucleotides may be obtained by site directed
mutagenesis of SEQ ID NO: 1 or of either 3.6 kb plasmid, or their homologues. This
may be useful where for example silent codon changes are required to sequences to
optimise codon preferences for a particular host cell in which the polynucleotide
sequences are being expressed. Other sequence changes may be desired in order to
introduce restriction enzyme recognition sites, or to alter the property or function of
the polypeptides encoded by the polynucleotides.

Methods of measuring polynucleotide homology are well known in the art. For
example the UWGCG package provides the BESTFIT programme which can be used
to calculate homology, for example on its default settings. For amino acid
homology with regard to polypeptides of the invention, which are discussed later, one
can employ BLAST (Basic Local Alignment Search Tool), which produces
alignments of amino acid sequences (and nucleotide sequences if necessary) to
determine sequence similarity. BLAST can thus be used to determine exact matches
or to identify homologues, and is particularly useful for those matches which do not
contain gaps. The BLAST technique uses an algorithm based on the High-scoring
Segment Pair (HSP).

The invention includes double stranded polynucleotides comprising a
polynucleotide sequence of the invention and its complement. 

Polynucleotides (e.g. probes or primers) of the invention may carry a revealing
label. Suitable labels include radioisotopes such as 32P or 35S, enzyme labels, or other
protein labels such as biotin. Detection techniques for these labels are known per se.

The polynucleotides (labelled or unlabelled) may be used in nucleic acid-based 
tests for detecting or sequencing another polynucleotide of the invention, in a 
sample. 

Polynucleotides of the invention include variants of the sequence of SEQ ID
NO: 1 or of either 3.6 kb plasmid which are capable of autonomously replicating or
remaining extrachromosomally in a host cell. Such variants may be stable in a
bacterium such as a Propionibacterium.

Generally the polynucleotide will comprise the replication origin and/or coding
region(s) of SEQ ID NO: 1 or of either 3.6 kb plasmid, or homologues of these
sequences. A polynucleotide of the invention which is stable in a host cell, such as
Propionibacterium or E. coli, is one which is not lost from the host within five
generations, such as fifteen generations, preferably thirty generations. Generally
such a polynucleotide would be inherited by both daughter cells every generation.
The polynucleotide may comprise a promoter or an origin of replication (e.g.
upstream of any sequences encoding a replication protein).

The polynucleotide of the invention can be transformed or transfected into a
bacterium, such as Propionibacterium or E. coli, for example by a suitable
method. It may be present in a bacterium at a copy number of 5 to 500, such as 10
to 100.

The polynucleotide may be capable of autonomous replication in a bacterium
other than a Propionibacterium. Such a bacterium may be E. coli, or a gram positive
or G:C rich bacterium or one of the class Actinomycetes. Such a polynucleotide will
generally comprise sequences which enable the polynucleotide to be autonomously
replicated in that bacterium. Such sequences can be derived from plasmids which are
able to replicate in that bacterium.

A polynucleotide of the invention may be one which has been produced by
replication in a Propionibacterium. Alternatively it may have been produced by
replication in another bacterium, such as E. coli. The polynucleotide may be able to
circumvent the host restriction systems of Propionibacterium.

Vectors 

A second aspect of the invention relates to a vector comprising a polynucleotide
of the first aspect. The vector may be capable of replication in a host cell, such as a
bacterium, for example an Actinomycete, e.g. Propionibacterium, or E. coli. The
vector may be a linear polynucleotide or, more usually, a circular polynucleotide.
The vector may be a hybrid of the polynucleotide of the invention and another
vector. The other vector may be an E. coli vector, such as pBR322, or a vector of the
pUC family, R1, ColD or RSF1010 or a vector derived therefrom.

The polynucleotide or vector of the invention may be a plasmid. Such a plasmid 
may have a restriction map the same as or substantially similar to the restriction 
maps shown in Figure 1, 2a or 2b.

The polynucleotide or vector may have a size of 1 kb to 20 kb, such as from 2 to
10 kb, optimally from 3 to 7 kb.

The polynucleotide or vector may comprise multiple functional cloning sites.
Such cloning sites generally comprise the recognition sequences of restriction
enzymes. The polynucleotide or vector may comprise the sequence shown in SEQ
ID NO: 1 and/or contain restriction enzyme recognition sites for EcoRI, SacI,
AlwNI, BsmI, BsaBI, BclI, ApaI, HindIII, SalI, HpaI, PstI, SphI, BamHI, Acc65I,
EcoRV and BglII. The polynucleotide or vector may thus comprise one, more than
one or all of these restriction enzyme sites, generally in the order shown in the
Figures.

Preferably, when present in a bacterium, such as a Propionibacterium or E. coli,
the polynucleotide or vector of the invention does not integrate into the chromosome
of the bacterium. Generally the polynucleotide or vector does not integrate within 5
generations, preferably 20 or 30 generations.

The polynucleotide or vector may be an autonomously replicating plasmid that
can remain extrachromosomal inside a host cell, which plasmid is derived from an
endogenous Propionibacterium plasmid, and when comprising a heterologous gene
(to the host) is capable of expressing that gene inside the host cell. The term "derived
from" means that the autonomously replicating plasmid includes a sequence the same
as the polynucleotide of the invention.

The vector of the invention may comprise a selectable marker. The selectable
marker may be one which confers antibiotic resistance, such as ampicillin,
kanamycin or tetracycline resistance genes. The selectable marker may be an
erythromycin resistance gene. The erythromycin resistance gene may be from
Actinomycetes, such as Saccharopolyspora erythraea, for example from
Saccharopolyspora erythraea NRRL2338. Other selectable markers which may be
present in the vector include genes conferring resistance to chloramphenicol,
thiostrepton, viomycin, neomycin, apramycin, hygromycin, bleomycin or
streptomycin.

The vector may be an expression vector, and so may comprise a heterologous
gene (which does not naturally occur in the host cell, e.g. Propionibacteria), or an
endogenous or homologous gene of the host cell, e.g. Propionibacteria. In the
expression vector the gene to be expressed is usually operably linked to a control
sequence which is capable of providing for the expression of the gene in a host cell.

The term "operably linked" refers to a juxtaposition wherein the components
described are in a relationship permitting them to function in their intended manner.
A control sequence "operably linked" to a coding sequence is ligated in such a
way that expression of the coding sequence is achieved under conditions compatible
with the control sequences.

The heterologous or endogenous gene may be inserted between nucleotides 1
and 200 or between nucleotides 1500 and 3555 of SEQ ID NO: 1 or at an equivalent
position in a homologous polynucleotide.
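
The insertion regions quoted above can be checked against the coding sequence coordinates given earlier for SEQ ID NO: 1 (273 to 1184 and 1181 to 1483). The Python sketch below is illustrative only: the function names are hypothetical, and treating all ranges as 1-based and inclusive at both ends is an assumption of this sketch.

```python
# Regions quoted in the text, read as 1-based inclusive ranges (an assumption).
INSERTION_WINDOWS = ((1, 200), (1500, 3555))
CODING_SEQUENCES = ((273, 1184), (1181, 1483))


def in_insertion_window(position):
    """True if a nucleotide position lies inside one of the insertion regions."""
    return any(low <= position <= high for low, high in INSERTION_WINDOWS)


def ranges_overlap(a, b):
    """True if two inclusive ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def windows_avoid_coding_sequences():
    """True if no insertion region overlaps either coding sequence."""
    return not any(
        ranges_overlap(window, cds)
        for window in INSERTION_WINDOWS
        for cds in CODING_SEQUENCES
    )


for pos in (100, 800, 1500, 3555):
    print(pos, in_insertion_window(pos))
print("insertion regions avoid both coding sequences:",
      windows_avoid_coding_sequences())
```

On these coordinates neither insertion region overlaps the two coding sequences, which fits the statement that insertion at these sites leaves the plasmid capable of autonomous replication.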

Such genes may comprise homologous or endogenous genes such as for
elongation factors, promoters, regulatory sequences or elements, and replication
proteins. Other genes (which may be heterologous to the host) include those
encoding for or assisting in the production of nutritional factors, immunomodulators,
hormones, proteins and enzymes (e.g. proteases, amylases, peptidases, lipases),
texturing agents, flavouring substances (e.g. diacetyl, acetone), gene clusters,
antimicrobial agents (e.g. nisin), substances for use in foodstuffs (e.g. in sausages,
cheese), metabolic enzymes, vitamins (e.g. B12), uroporphyrinogen (III)
methyltransferase (UP III MT), cobA, antigens and (e.g. for vaccines) therapeutic
agents. As will be seen, the hosts can produce a wide variety of substances, not just
polypeptides, which may be either the desired product or may be used to produce the
desired product.

The heterologous gene may have a therapeutic effect on a human or animal. 
Such a gene may comprise an antigen, for example from a pathogenic organism. The 
host, such as Propionibacterium, comprising a polynucleotide with such a
heterologous gene may be used as or in a vaccine, and may provide protection 
against the pathogens. 

The heterologous antigen may be a complete protein or a part of a protein 
containing an epitope. The antigen may be from a bacterium, a virus, a yeast or a 
fungus. 

Host cells and expression 

The host cell forms the third aspect of the invention and comprises a 
polynucleotide or vector of the first or second aspect. The host cell may be a 
bacterium e.g. of the class Actinomycetes. The bacterium may be a 
Propionibacterium or E. coli. The Propionibacterium may be P. freudenreichii, 
P. jensenii, P. thoenii or P. acidipropionici. 

In a fourth aspect the invention provides a process for producing a host cell of
the third aspect, the process comprising transforming or transfecting a host cell with
a polynucleotide or vector of the first or second aspect, e.g. with known
transformation techniques.

In a fifth aspect the invention provides a process for the preparation of a
polypeptide encoded by the polynucleotide or vector of the invention present in a host
cell of the invention, comprising placing or culturing the host cell in conditions where
expression of the polypeptide occurs.

This aspect of the invention thus provides a process for the preparation of a 
polypeptide encoded by a given gene, which process comprises cultivating a host cell 
transformed or transfected with an expression vector comprising the gene, under 
conditions to provide for expression of the said polypeptide, and optionally
recovering the expressed polypeptide. The host cell may be of the class
Actinomycetes, or a gram positive bacterium such as Propionibacterium, or E. coli.

Promoters, translation initiators, translation terminators, elongation factor genes,
ribosomal RNA, antibiotic resistance genes, synthetic promoters (e.g. designed on
consensus sequences) or other expression regulation signals present in the
polynucleotide or vector can be those which are compatible with expression in the
host cell. Such promoters include the promoters of the endogenous genes of the host
cell.

Culturing conditions may be aerobic or anaerobic conditions, depending on the
host. For a fermentation process the host cell would be placed in anaerobic, and then
possibly aerobic, conditions. The compound produced, such as an expressed
polypeptide, may then be recovered, e.g. from the host cell or fermentation medium.
The expressed polypeptide may be secreted from the host cell. Alternatively the
polypeptide may not be secreted from the host cell. In such a case the polypeptide
may be expressed on the surface of the host cell. This may be desirable, for example,
if the polypeptide comprises an antigen to which an immune response is desired in
a human or animal.

A homologous gene that may be present in the vector of the invention may be
cobA. A host cell comprising this vector may therefore be capable of producing a
compound such as vitamin B12 from a substrate or the compound may be the product
of an enzyme. The invention specifically provides a process for the preparation of
vitamin B12 comprising cultivating or fermenting such a host cell under conditions in
which the UP(III) MT gene is expressed. The expressed enzyme can be contacted
with a suitable substrate under conditions in which the substrate is converted to
vitamin B12. This may result in increased production of vitamin B12.

Therapeutics

As described above the polynucleotide of the invention may comprise a
heterologous gene which is a therapeutic gene. Thus the invention includes a host
cell comprising a vector of the invention which comprises a therapeutic gene for use
in a method of treatment of the human or animal body by therapy. Such a host cell
may be Propionibacterium. The host cell may be alive or dead.

The host cells can be formulated for clinical administration by mixing them with
a pharmaceutically acceptable carrier or diluent. For example they can be formulated
for topical, parenteral, intravenous, intramuscular, subcutaneous, oral or transdermal
administration. The host cells may be mixed with any vehicle which is
pharmaceutically acceptable and appropriate for the desired route of administration.
The pharmaceutically acceptable carrier or diluent for injection may be, for example,
a sterile or isotonic solution such as water for injection or physiological saline.

The dose of the host cells may be adjusted according to various parameters,
especially according to the type of the host cells used, the age, weight and condition 
of the patient to be treated; the mode of administration used; the condition to be 
treated; and the required clinical regimen. As a guide, the number of host cells 
administered, for example by oral administration, is from 10' to 10" host cells per 
dose for a 70 kg adult human.

The routes of administration and dosages described are intended only as a guide 
since a skilled practitioner will be able to determine readily the optimum route of 
administration and dosage of any particular patient and condition. 

Polypeptides 

A sixth aspect of the invention provides a polypeptide of the invention
comprising one of the amino acid sequences set out in SEQ ID NO: 2 or 3 or a
substantially homologous sequence, or a fragment of either of these sequences.
The polypeptide may be one encoded by a polynucleotide of the first aspect. In
general, the naturally occurring amino acid sequences shown in SEQ ID NO: 2 or 3
are preferred. However, the polypeptides of the invention include homologues of the
natural sequences, and fragments of the natural sequences and their homologues,
which have the activity of the naturally occurring polypeptides. One such activity 
may be to effect the replication of the polynucleotide of the invention. In particular, 
a polypeptide of the invention may comprise: 

(a) the protein of SEQ ID No: 2 or 3; or 

(b) a homologue thereof from Actinomycetes, such as Propionibacterium
freudenreichii or other Propionibacterium strains; or

(c) a protein at least 70% homologous to (a) or (b). 

A homologue may occur naturally in a Propionibacterium and may function in a 
substantially similar manner to a polypeptide of SEQ ID NO: 2 or 3. Such a
homologue may occur in Actinomycetes or gram positive bacteria. 

A protein at least 70% homologous to the proteins of SEQ ID NO: 2 or 3 or a
homologue thereof will be preferably at least 80 or 90% and more preferably at least
95%, 97% or 99% homologous thereto over a region of at least 20, preferably at least
30, for instance at least 40, 60 or 100 or more contiguous amino acids. Methods of
measuring protein homology are well known in the art and it will be understood by
those of skill in the art that in the present context, homology is calculated on the
basis of amino acid identity (sometimes referred to as "hard homology").

The sequences of the proteins of SEQ ID NO: 2 and 3 and of homologues can
thus be modified to provide other polypeptides within the invention.

Amino acid substitutions may be made, for example from 1, 2 or 3 to 10, 20 or
30 substitutions. The modified polypeptide generally retains its natural activity.
Conservative substitutions may be made, for example according to the following
Table. Amino acids in the same block in the second column and preferably in the
same line in the third column may be substituted for each other:



ALIPHATIC    Non-polar            G A P 
                                  I L V 
             Polar - uncharged    C S T M 
                                  N Q 
             Polar - charged      D E 
                                  K R 
AROMATIC                          H F W Y 


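
The rule stated for the Table (same block permitted, same line preferred) can be sketched as a small lookup. The data structure, the function name and the decision to treat the aromatic row as a block of its own are assumptions made for illustration only.

```python
# Rows of the conservative-substitution table: (block in second column, line in third column).
# Treating the aromatic row as its own block is an assumption; the table leaves that cell empty.
SUBSTITUTION_TABLE = [
    ("non-polar", "GAP"),
    ("non-polar", "ILV"),
    ("polar-uncharged", "CSTM"),
    ("polar-uncharged", "NQ"),
    ("polar-charged", "DE"),
    ("polar-charged", "KR"),
    ("aromatic", "HFWY"),
]


def locate(residue: str) -> tuple[str, str]:
    """Return the (block, line) of the table in which a one-letter residue appears."""
    for block, line in SUBSTITUTION_TABLE:
        if residue.upper() in line:
            return block, line
    raise ValueError(f"residue {residue!r} is not in the table")


def substitution_level(original: str, replacement: str) -> str:
    """Classify a substitution as 'same line', 'same block' or 'non-conservative'."""
    block_a, line_a = locate(original)
    block_b, line_b = locate(replacement)
    if line_a == line_b:
        return "same line"       # preferred substitution
    if block_a == block_b:
        return "same block"      # permitted substitution
    return "non-conservative"


if __name__ == "__main__":
    for pair in (("I", "V"), ("G", "I"), ("G", "D")):
        print(pair, substitution_level(*pair))
```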

Polypeptides of the invention also include fragments of the above-mentioned 
full length polypeptides and variants thereof, including fragments of the sequences 
set out in SEQ ID NO: 2 or 3. Such fragments can retain the natural activity of the 
full-length polypeptide. 

Suitable fragments will be at least about 5, e.g. 10, 12, 15 or 20 amino acids in 
size. Polypeptide fragments of SEQ ID No: 2 and 3 and homologues thereof may 
contain one or more (e.g. 2, 3, 5, or 10) substitutions, deletions or insertions, 
including conserved substitutions. 

Polypeptides of the invention may be in a substantially isolated form. A 
polypeptide of the invention may also be in a substantially purified form, in which 
case it will generally comprise the polypeptide in a preparation in which more than 
90%, e.g. 95%, 98% or 99% of the polypeptide in the preparation is a polypeptide of 
the invention. 

A polypeptide of the invention may be labelled with a revealing label. The 
revealing label may be any suitable label which allows the polypeptide to be 
detected. Suitable labels include radioisotopes, e.g. 125I, 35S, enzymes, antibodies, 
polynucleotides and linkers such as biotin. 

Industrial Applications 

As will be apparent from the discussion, the host cells of the third aspect can be 
used to produce not only the recombinant proteins, but also other compounds of 
interest, including non-proteins such as inorganic chemicals, in particular vitamins. 

A seventh aspect of the present invention therefore relates to a process for the 
production of a compound, the process comprising culturing or fermenting host cells 
of the third aspect under conditions whereby the desired compound is produced. 
Although this compound may be a polypeptide, for example a polypeptide of the 
sixth aspect, it may also be one of the compounds mentioned in the previous 
discussion concerning genes to be expressed. Clearly inorganic compounds will not 
be expressed by a gene, but they may be produced by an enzyme, or the polypeptide 
or enzyme may assist the host cell in the production of the desired compound. These 
compounds may be produced inside the cell, and later isolated, for example 
following lysis of the host cell, or they may pass through the wall of the host cell into 
a surrounding medium, which may be a fermentation medium, for example an 
aqueous solution. In this way, the host cells can be cultured in an aqueous medium 
that comprises cells and nutrients for the cells, for example assimilable sources of 
carbon and/or nitrogen. 

The polypeptides so produced may have therapeutic uses. They may be drugs or 
other pharmacologically active compounds, or may be antigenic or immunogenic, in 
which case they may find use in vaccines. 

The invention additionally encompasses compounds produced by this process, 
whether or not the compound is a recombinant polypeptide. Compounds specifically 
contemplated are vitamins, such as vitamin B12 (cobalamin). 

In some cases the compound need not be isolated either from the fermentation 
medium or from the host cells. The host cells may themselves be used in particular 
applications, for example in, or in the manufacturing of, foodstuffs such as sausages, 
or in cheese making, or the host cells may for example be included in an animal feed, 
such as when the host cells contain a compound to be ingested by the animal in 
question. The invention therefore extends to the use of these compounds or the host 
cells, in the production of foodstuffs such as cheeses and sausages. The invention 
also contemplates foodstuffs or animal feed comprising host cells or a compound 
produced by the invention. 

In a particularly preferred embodiment of the present invention the host cells can 
be used in a cheese making process, and so the invention additionally includes a 
process for manufacturing cheese where the microorganisms employed are host cells 
of the invention. The host cells may be used instead of, or in addition to, other 
bacteria, such as lactic acid bacteria. Propionic acid bacteria are currently used in 
cheese making processes, for example with mesophilic cultures (Maasdam type of 
cheese) as well as thermophilic cultures (Emmental). Both mesophilic and 
thermophilic organisms can be responsible for the acidification of the milk or cheese. 
In this way the host cells of the invention can be used not only for cheese but also for 
the production of other fermented dairy products (e.g. yoghurts). Propionic acid 
bacteria are employed in cheese making because of their ability to convert lactate 
and carbohydrates to propionic acid, acetic acid and carbon dioxide. The host cells 
of the invention, especially if they are propionibacteria, can be employed because 
they can be less sensitive to nitrates and salt, which may allow the reduction or 
omission of bactofugation of the milk (usually employed to reduce the levels of 
Clostridia). 

The fermentation of the host cells may have one or two phases or stages. These 
may be for example a growth and/or production phase, or anaerobic and/or aerobic 
phase. Preferably, there will be a growth and/or anaerobic phase, and suitably also 
(e.g. afterwards) a production and/or aerobic phase. 

The carbon and/or nitrogen sources may each be complex sources or individual 
compounds. For carbon, it is preferred that this is glucose. For nitrogen, appropriate 
sources include yeast extract or ammonia or ammonium ions. 

Preferred features and characteristics of one aspect of the invention are suitable 
for another aspect mutatis mutandis. 

Figures 

The invention is illustrated by the accompanying drawings in which: 

Figure 1 is a restriction map of a vector within the invention, p545, obtained 
from P. freudenreichii LMG 16545 (CBS 101022); and 

Figures 2a and 2b each contain maps of two vectors, all four vectors being 
within the invention. 

The invention will now be described, by way of example, by reference to the 
following Examples, which are not to be construed as being limiting. 

EXAMPLE 1 

Screening of Propionibacterium strains 

A collection of 75 nonpathogenic strains of Propionibacterium was screened for 
the presence of indigenous plasmids. The majority of strains were obtained from the 
BCCM/LMG culture collection (Ghent, Belgium), although some strains were 
obtained from ATCC (Rockville, Md., USA) or from DSM (Braunschweig, 
Germany). Screening was performed using a small scale plasmid isolation procedure. 
First, bacteria were cultivated anaerobically in MRS medium for 48 hrs at 30°C. 
Plasmids were then purified from the bacteria using a plasmid DNA isolation 
procedure originally developed for E. coli, with some modifications: cells from a 5 
ml culture were washed in a 25% sucrose, 50 mM Tris-HCl pH 8 solution, 
resuspended in 250 µl TENS (25% sucrose + 50 mM NaCl + 50 mM Tris-HCl + 5 mM 
EDTA, pH 8) containing 10 mg/ml lysozyme (Boehringer Mannheim), and incubated 
at 37°C for 20-30 minutes. The bacterial cells were then lysed in 500 µl of 0.2 N 
NaOH/1% SDS (2-5 minute incubation on ice). After addition of 400 µl 3 M NaAc 
pH 4.8 (5 minutes on ice) and subsequent extraction with phenol/chloroform, the 
DNA was precipitated by addition of isopropanol. 

The DNA was analysed by electrophoresis on 1% agarose gels, and visualised 
by ethidium bromide. Whereas most strains were negative, i.e. did not reveal the 
presence of indigenous plasmids in this analysis, the majority of strains that proved 
positive contained large (≥20 kb) plasmids. Smaller plasmids were observed in 6 
strains. Of these, P. jensenii LMG16453, P. acidipropionici ATCC4875, P. 
acidipropionici LMG16447 and a nonspecified Propionibacterium strain 
(LMG16550) contained a plasmid in the size range of 6-10 kb. Two strains (P. 
freudenreichii LMG16545 and P. freudenreichii LMG16546) showed an identical 
plasmid profile of 2 plasmids. One plasmid was large (size not determined) and the 
other was smaller, more abundantly present and had a size of 3.6 kb. These 3.6 kb 
plasmids from LMG16545 and LMG16546 were chosen for further analysis. 

EXAMPLE 2 

Analysis of an indigenous plasmid from strains LMG16545 and LMG16546 

The 3.6 kb plasmids were isolated from both strains and further purified by 
CsCl-ethidium bromide density gradient ultracentrifugation. Limited restriction 
maps were made of both preparations and these turned out to be identical. The 
restriction map of the 3.6 kb plasmid is shown in Fig. 1. Restriction enzymes and T4 
ligase were obtained from New England Biolabs or GIBCO BRL. 

The 3.6 kb plasmid from strain LMG16545 (from here on referred to as p545) 
was radioactively labelled and used in Southern blot hybridization experiments. 
Hybridisation conditions were 0.2 x SSC, 65°C. It reacted equally well with both 
LMG16545 and LMG16546 plasmid DNA extracts, supporting the close relationship 
of these strains, whereas a plasmid DNA extract from P. acidipropionici 
ATCC4875, which harbors a 6.6 kb plasmid called pTY1 or pRGO1, failed to react. 

The DNA sequence of plasmid p545 was determined with fluorescent dye 
labelled dideoxyribonucleotides in an Applied Biosystems 373A automatic 
sequencer, and is included as SEQ ID No: 1 in the sequence listing. Sequence 
analysis was performed on plasmid DNA that had been linearized with EcoRI and 
inserted into EcoRI digested pBluescript SKII+ DNA (Stratagene, La Jolla, Ca., 
USA). Computer assisted analysis of the sequence thus obtained using BLAST 
search revealed homologies to proteins involved in replication of plasmids from 
several GC-rich organisms (e.g., pAL5000 encoded repA and repB from 
Mycobacterium fortuitum show 28-30% identity and 34-38% similarity with the 
respective putative replication proteins from plasmid p545; pXZ10142 from 
Corynebacterium glutamicum [PIR Accession Number S32701] is another example 
of plasmids encoding replication proteins homologous to the p545 putative 
replication proteins). The results of the database comparisons with homologous 
sequences are detailed in Examples 7 and 8. 

EXAMPLE 3 

Construction of E. coli/Propionibacterium shuttle vectors 

E. coli plasmid pBR322 was digested with EcoRI and AvaI and the smaller 
fragment thus generated (measuring 1.4 kb and encompassing the tetracycline 
resistance conferring gene) was replaced by a synthetic duplex DNA. The synthetic 
duplex DNA was designed so as to link EcoRI and AvaI ends and to supply a 
number of unique restriction enzyme recognition sites: 

5'-AATTCAAGCTTGTCGACGTTAACCTGCAGGCATGCGGATCCGGTACCGAT 
ATCAGATCT-3' (SEQ. ID. NO. 4) 

3'-GTTCGAACAGCTGCAATTGGACGTCCGTACGCCTAGGCCATGGCTATAGT 
CTAGAAGCC-5' (SEQ. ID. NO. 5) 

The following restriction enzyme recognition sites were supplied in this way: 
EcoRI (restored), HindIII, SalI, HpaI, PstI, SphI, BamHI, Acc65I, EcoRV, BglII 
(AvaI is not restored). 
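
The design of the synthetic duplex can be checked with a short script. The sketch below confirms that the two printed strands pair once the 4-nucleotide 5' protrusion of the top strand is set aside, reports the single-stranded ends, and counts each listed recognition sequence in the top strand. The helper name and the assumed 4-nucleotide offset are illustrative choices, and the recognition sequences are the standard ones for the named enzymes rather than values quoted in the text.

```python
# SEQ ID NO: 4, written 5'->3', split as printed over two lines.
TOP = "AATTCAAGCTTGTCGACGTTAACCTGCAGGCATGCGGATCCGGTACCGAT" + "ATCAGATCT"
# SEQ ID NO: 5, written 3'->5' exactly as printed under the top strand.
BOTTOM_3_TO_5 = "GTTCGAACAGCTGCAATTGGACGTCCGTACGCCTAGGCCATGGCTATAGT" + "CTAGAAGCC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "HindIII": "AAGCTT", "SalI": "GTCGAC", "HpaI": "GTTAAC",
    "PstI": "CTGCAG", "SphI": "GCATGC", "BamHI": "GGATCC",
    "Acc65I": "GGTACC", "EcoRV": "GATATC", "BglII": "AGATCT",
}


def duplex_ends(top: str, bottom_3_to_5: str, offset: int):
    """Return (top 5' overhang, strands pair?, bottom 5' overhang read 5'->3').

    `offset` is the number of unpaired bases at the 5' end of the top strand.
    """
    paired_top = top[offset:]
    paired_bottom = bottom_3_to_5[:len(paired_top)]
    pairs = paired_top.translate(COMPLEMENT) == paired_bottom
    bottom_overhang = bottom_3_to_5[len(paired_top):][::-1]
    return top[:offset], pairs, bottom_overhang


if __name__ == "__main__":
    print(duplex_ends(TOP, BOTTOM_3_TO_5, offset=4))
    for enzyme, site in SITES.items():
        print(enzyme, TOP.count(site))
```

The top-strand overhang AATT followed by C is what regenerates GAATTC when joined to the EcoRI end of the vector, which matches the statement that EcoRI is restored.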

This synthetic DNA was ligated to the large fragment (T4 ligase was used) and the 
ligation mixture transferred back to E. coli. A plasmid of the expected 
composition was obtained (pBR322AI). The multiple cloning site can be used to 
introduce a selection marker as well as plasmid p545 DNA. 

As an example, the construction of an E. coli/Propionibacterium shuttle plasmid 
conferring resistance to erythromycin was performed as will now be described. 

A 1.7 kb Acc65I fragment from the Saccharopolyspora erythraea NRRL2338 
erythromycin biosynthesis cluster and containing the erythromycin resistance 
conferring gene was inserted into Acc65I linearized pBR322AI. Then the newly 
derived construct, named pBRES, was linearized with EcoRV and ligated to p545 
DNA that had been digested with BsaBI. E. coli transformants were found to harbor a 
vector with the correct insert, in both orientations. The resulting plasmid vectors were 
named pBRESP36B1 and pBRESP36B2 (Figs. 2a and 2b). 

Plasmid vector constructs were also obtained with p545 DNA linearized in 
another restriction site situated outside the putative replication region, namely AlwNI. 
For this construction the pBRES vector had to be provided with a suitable cloning 
site. An adaptor was designed consisting of two complementary oligonucleotides of 
the following composition (SEQ. ID. Nos 6 and 7): 

5' GTACCGGCCGCTGCGGCCAAGCTT 3' 
5' GATCAAGCTTGGCCGCAGCGGCCG 3' 

Annealing of these oligos created a double stranded DNA fragment with Acc65I 
and BglII cohesive ends respectively, which moreover contains an internal SfiI 
restriction site that provides ends compatible with the AlwNI digested p545 plasmid. 
This adaptor was cloned in pBRES between the BglII and the proximal Acc65I site. 
The pBRES-Sfi vector thus obtained was subsequently digested by SfiI and ligated to 
AlwNI digested p545. Transformation of E. coli yielded transformants with the 
correct vector as confirmed by restriction enzyme analysis. The vector obtained was 
named pBRESP36A (Fig. 2). 
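
The adaptor described above can be checked in the same spirit. This sketch anneals the two oligonucleotides in silico, reports the 4-nucleotide 5' ends left single-stranded, and locates the internal SfiI site using the standard recognition pattern GGCCNNNNNGGCC. Function names and the assumed overhang length of four are illustrative and not taken from the text.

```python
import re

ADAPTOR_TOP = "GTACCGGCCGCTGCGGCCAAGCTT"      # SEQ ID NO: 6, 5'->3'
ADAPTOR_BOTTOM = "GATCAAGCTTGGCCGCAGCGGCCG"   # SEQ ID NO: 7, 5'->3'

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SFII = re.compile(r"GGCC[ACGT]{5}GGCC")        # standard SfiI recognition pattern


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_ends(top: str, bottom: str, overhang: int = 4):
    """Return both 5' overhangs if the remaining bases pair perfectly, else None."""
    if reverse_complement(top[overhang:]) != bottom[overhang:]:
        return None
    return top[:overhang], bottom[:overhang]


if __name__ == "__main__":
    print(annealed_ends(ADAPTOR_TOP, ADAPTOR_BOTTOM))
    match = SFII.search(ADAPTOR_TOP)
    print(match.group(), match.start())
```

The reported ends GTAC and GATC are the overhangs produced by Acc65I (G^GTACC) and BglII (A^GATCT) respectively, consistent with the cloning step described.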

EXAMPLE 4 

Transformation of Propionibacterium with E. coli/Propionibacterium shuttle 
vectors 

Transformation of Propionibacterium freudenreichii strain ATCC6207 with 
pBRESP36B1 will be described. 

The bacterial cells were cultivated in SLB (sodium lactate broth) at 30°C to a 
stationary growth phase, and subsequently diluted 1:50 in fresh SLB. After 
incubation at 30°C for around 20 hours, cells (now in the exponential growth phase) 
were harvested and washed extensively in cold 0.5 M sucrose. Subsequently cells 
were washed once in the electroporation buffer, consisting of 0.5 M sucrose buffered 
by 1 mM potassium acetate, pH 5.5, and finally resuspended in this electroporation 
buffer in about 1/100 of the original culture volume. Cells were kept on ice during 
the whole procedure. 

For the electroporation (apparatus from BIORAD), 80-100 µl of cell 
suspension was mixed with ±1 µg of DNA (or smaller amounts) in a cooled 1 or 2 
mm electroporation cuvette, and an electric pulse delivered. Optimal pulse conditions 
were found to be 25 kV/cm at 200 Ω resistance and 25 µF capacitance. However, 
lower and higher voltages (also at 100 Ω) also yield transformants. 

Immediately after the pulse, 900 µl cold SLB containing 0.5 M sucrose was 
added to the pulsed cell suspension, which was subsequently incubated for 2.5 to 3 
hours at 30°C before plating appropriate dilutions on SLB/agar plates containing 0.5 
M sucrose and 10 µg/ml erythromycin. After a 5 to 7 day incubation period at 30°C 
under anaerobic conditions, transformants were detected. 
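
The pulse conditions quoted for the electroporation step can be converted into instrument settings with simple arithmetic. The sketch below derives the voltage from the field strength and cuvette gap, and a nominal time constant as resistance times capacitance. It assumes the parallel resistor setting dominates the circuit (that is, the sample itself has a much higher resistance), which the text does not state; the function name is hypothetical.

```python
def pulse_settings(field_kv_per_cm: float, gap_mm: float,
                   resistance_ohm: float, capacitance_uf: float):
    """Return (voltage in kV, nominal RC time constant in ms).

    Assumes an ideal RC discharge governed by the set resistance alone.
    """
    voltage_kv = field_kv_per_cm * gap_mm / 10.0                  # field x gap (mm converted to cm)
    time_constant_ms = resistance_ohm * capacitance_uf / 1000.0   # ohm x microfarad = microseconds
    return voltage_kv, time_constant_ms


if __name__ == "__main__":
    # 25 kV/cm, 200 ohm, 25 uF as quoted; 1 mm and 2 mm cuvettes.
    for gap in (1, 2):
        print(gap, "mm:", pulse_settings(25, gap, 200, 25))
```

For a 1 mm cuvette this corresponds to 2.5 kV with a nominal time constant of 5 ms; the same field in a 2 mm cuvette would call for 5 kV, and whether a given instrument can deliver that is not addressed in the text.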

DNA isolated from E. coli DH5α (Promega) yielded a transformation efficiency 
of 20-30 transformants per µg DNA. A 10-100 fold higher efficiency is achieved 
when DNA is isolated from E. coli JM110 (dam-, dcm- strain). E. coli transformation 
was done according to BIORAD instructions. 

Transformants contained the authentic vectors, indistinguishable from the 
original plasmid DNA used for transformation of ATCC6207. This was shown by 
restriction enzyme analysis of plasmid DNA isolated from the transformants by the 
small scale plasmid DNA isolation procedure referred to before. 

Vectors were exclusively present as autonomously replicating plasmids. 
Southern blot hybridization with total DNA isolates showed that chromosomal 
DNA did not hybridise to the vector DNA used as a probe, indicating that no 
chromosomal integration of plasmid DNA occurred. 

Transformation was also successful with vectors pBRESP36B2 and 
pBRESP36A, indicating that functionality of the vector was independent of the 
orientation of p545 or the cloning site used. Also in this case the authenticity of the 
vectors was confirmed. 

Moreover, transformation of P. freudenreichii strain ATCC6207 with DNA 
isolated from a Propionibacterium transformant resulted in a 10^-10^ fold increased 
transformation efficiency as compared to that obtained with DNA isolated from E. 
coli DH5α. 

Transformation of another P. freudenreichii strain, LMG16545 (the same strain 
from which the p545 plasmid was obtained), resulted in a transformation efficiency 
comparable to that of the ATCC strain. 

The results of the transformations, and the effect on vitamin B12 production, are 
shown in the following Table. 

Eight out of 10 transformants gave up to a 50% higher vitamin B12 content than 
the control strain. 



Strain         Transformed plasmid    Vitamin B12 (mg/g dry matter) 

COB1           pBRES36COB             0.57 
COB2           pBRES36COB             0.67 
COB3           pBRES36COB             0.83 
COB4           pBRES36COB             0.68 
COB5           pBRES36COB             0.69 
COB6           pBRES36COB             0.61 
COB7           pBRES36COB             0.53 
COB8           pBRES36COB             0.64 
COB9           pBRES36COB             0.50 
COB10          pBRES36COB             0.74 
recATCC6207    pBRESP36B2             0.54 
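
The tabulated titres can be summarised with a few lines of arithmetic. The sketch below takes the ten transformant values and the control value from the Table, counts the transformants exceeding the control and expresses the highest titre as a percentage change. The strain labels are read as COB1 to COB10 and the variable and function names are illustrative assumptions.

```python
# Vitamin B12 content (mg/g dry matter) as listed in the Table.
TITRES = {
    "COB1": 0.57, "COB2": 0.67, "COB3": 0.83, "COB4": 0.68, "COB5": 0.69,
    "COB6": 0.61, "COB7": 0.53, "COB8": 0.64, "COB9": 0.50, "COB10": 0.74,
}
CONTROL = 0.54  # recATCC6207 carrying the vector pBRESP36B2


def relative_change(value: float, control: float) -> float:
    """Percentage change of `value` relative to `control`."""
    return 100.0 * (value - control) / control


above_control = [strain for strain, titre in TITRES.items() if titre > CONTROL]
best_strain = max(TITRES, key=TITRES.get)
mean_titre = sum(TITRES.values()) / len(TITRES)

if __name__ == "__main__":
    print(len(above_control), "of", len(TITRES), "transformants exceed the control")
    print(best_strain, round(relative_change(TITRES[best_strain], CONTROL), 1), "% above control")
    print("mean change:", round(relative_change(mean_titre, CONTROL), 1), "%")
```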



EXAMPLE 5 

Construction of plasmid vector containing the cobA gene 

The construction and application of a plasmid vector to increase the level of 
vitamin B12 (cobalamin) synthesis in P. freudenreichii strain ATCC6207 will be 
described. 

The promoter region of the gene conferring erythromycin resistance in 
Saccharopolyspora erythraea was generated by PCR using the following primers 
(SEQ. ID. NOs 8 and 9): 

forward primer: (5' - 3') 

AAACTGCAGCTGCTGGCTTGCGCCCGATGCTAGTC 

reverse primer: (5' - 3') 

AAACTGCAGCAGCTGGGCAGGCCGCTGGACGGCCTGCCCTCGAGCTCGT 
CTAGAATGTGCTGCCGATCCTGGTTGC 

The PCR fragment thus generated contained an AlwNI site at the 5' end followed 
by the authentic promoter region and the first 19 amino acids of the coding region of 
the erythromycin resistance gene, to ensure proper transcription and translation 
initiation. At the 3' end XbaI and XhoI sites were provided (for insertion of the cobA 
gene in a later stage), a terminator sequence as present downstream from the 
erythromycin resistance gene, and an AlwNI site. 

The PCR product was digested by AlwNI and ligated to pBRESP36B2, partially 
digested with AlwNI. Of the two AlwNI sites present in pBRESP36B2, only the one 
present in the p545 specific part of the vector will accommodate the fragment. E. 
coli transformants were obtained harboring the expected construct, named 
pBRES36pEt. This vector was used for further constructions as described below. 

The coding sequence of cobA, the gene encoding uroporphyrinogen III 
methyltransferase, was generated by PCR from Propionibacterium freudenreichii 
strain ATCC6207, using the following primers (SEQ. ID. NOs 10 and 11): 

forward: (5' - 3') 

CTAGTCTAGACACCGATGAGGAAACCCGATGA 

reverse: (5' - 3') 

CCCAAGCTTCTCGAGTCAGTGGTCGCTGGGCGCGCG 

The cobA gene thus amplified carries an XbaI site at the N terminal coding 
region, and HindIII and XhoI sites at the C terminal coding region. 
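
The restriction sites engineered into the four primers of this Example can be located by pattern matching. The sketch below scans each primer for the standard recognition sequences of AlwNI (CAGNNNCTG), XbaI, XhoI, HindIII and PstI and reports 0-based start positions. The primer labels, the dictionary layout and the function name are assumptions for illustration; all five sites are palindromic, so scanning the written strand is sufficient.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

# Standard recognition sequences; N stands for any base.
ENZYMES = {
    "AlwNI": "CAGNNNCTG",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
}

# Primer sequences as printed (5'->3'); the dictionary keys are hypothetical labels.
PRIMERS = {
    "erm_promoter_fwd": "AAACTGCAGCTGCTGGCTTGCGCCCGATGCTAGTC",
    "erm_promoter_rev": ("AAACTGCAGCAGCTGGGCAGGCCGCTGGACGGCCTGCCCTCGAGCTCGT"
                         "CTAGAATGTGCTGCCGATCCTGGTTGC"),
    "cobA_fwd": "CTAGTCTAGACACCGATGAGGAAACCCGATGA",
    "cobA_rev": "CCCAAGCTTCTCGAGTCAGTGGTCGCTGGGCGCGCG",
}


def sites_in(seq: str) -> dict[str, list[int]]:
    """Map enzyme name to 0-based start positions of its site in `seq`."""
    found = {}
    for enzyme, pattern in ENZYMES.items():
        body = "".join(IUPAC[base] for base in pattern)
        regex = re.compile("(?=(" + body + "))")  # lookahead so overlapping sites are reported
        positions = [m.start() for m in regex.finditer(seq)]
        if positions:
            found[enzyme] = positions
    return found


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq))
```

The scan finds XbaI in the cobA forward primer and HindIII plus XhoI in the cobA reverse primer, in line with the description of the amplified gene.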

The functionality of this cobA gene was confirmed by cloning the PCR product 
as an XbaI-HindIII fragment in pUC18, and subsequent transformation of E. coli 
strain JM109. Transformants with a functional cobA gene show a bright red 
fluorescence when illuminated with UV light. Plasmid DNA isolated from such a 
transformant was digested with XbaI and XhoI, ligated to likewise digested 
pBRESP36B2 DNA and used for transformation of E. coli. DNA from several 
transformants was analysed by restriction enzyme digestion and gel electrophoresis. 
Transformants were found to bear the correct insert in the expression vector. This 
new vector was named pBRES36COB. This vector was subsequently transferred to 
P. freudenreichii ATCC6207 following the protocol described before. Ten of the 
transformants obtained were analysed and were found to harbor the pBRES36COB 
vector, which was again indistinguishable from the original vector used for 
transformation, as shown by analysis of the restriction enzyme digests. In these ten 
transformants the level of vitamin B12 synthesis was determined as follows. 

Frozen cultures of Propionibacterium transformants 1 through 10, as well as a 
control strain containing only the vector plasmid pBRESP36B2, were inoculated in 
100 ml flasks containing 50 ml of BHI (Brain Heart Infusion) medium (Difco) and 
incubated for 72 hrs at 28°C without shaking. From this preculture 4 ml were 
transferred to 200 ml of production medium consisting of Difco yeast extract 15 g/l, 
Na-lactate 30 g/l, KH2PO4 0.5 g/l, MnSO4 0.01 g/l, and CoCl2 0.005 g/l in a 500 ml 
shake flask and incubated at 28°C for 56 hrs without shaking, followed by 48 hrs in a 
New Brunswick rotary shaker at 200 rpm. 

Vitamin B12 titres were measured using a known HPLC method. Nine out of 10 
transformants showed an approx. 25% higher vitamin B12 production than the 
control strain. 

EXAMPLE 6 

Stability of the plasmids 

All three shuttle vectors pBRESP36A, pBRESP36B1, and pBRESP36B2 were 
stably maintained over 30 generations of culturing of the respective transformants: 
no loss of erythromycin resistance was observed as determined by viability counts on 
selective (erythromycin containing) and non-selective agar plates. The structural 
stability of the plasmid in the transformant population after 30 generations was 
established by plasmid DNA isolation and characterisation by restriction enzyme 
mapping as described above: only restriction fragments similar to those of the 
authentic plasmid were observed. 
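
Viability counts of the kind described can be turned into a retention figure and, under a simple model, a per-generation loss rate. The sketch below assumes a constant probability of plasmid loss per generation and no growth difference between plasmid-bearing and plasmid-free cells; these modelling assumptions, the function names and the colony counts in the usage example are hypothetical and not taken from the text.

```python
def plasmid_retention(selective_cfu: float, nonselective_cfu: float) -> float:
    """Fraction of viable cells still resistant, from paired plate counts."""
    if nonselective_cfu <= 0:
        raise ValueError("non-selective count must be positive")
    return selective_cfu / nonselective_cfu


def loss_rate_per_generation(retention: float, generations: int) -> float:
    """Per-generation loss probability p, assuming retention = (1 - p) ** generations."""
    if not 0.0 < retention <= 1.0:
        raise ValueError("retention must lie in (0, 1]")
    return 1.0 - retention ** (1.0 / generations)


if __name__ == "__main__":
    # Hypothetical counts: 198 colonies on erythromycin plates, 200 on plain plates.
    retention = plasmid_retention(198, 200)
    print(retention, loss_rate_per_generation(retention, 30))
```

With no observed loss of resistance, as reported, the retention is 1.0 and the estimated loss rate is zero within the resolution of the plate counts.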



WO 99/67356                                                  PCT/EP99/04416 



EXAMPLE 7 

Database sequence homology analysis for predicted polypeptide encoded by the first 
open reading frame (SEQ. ID. No. 2) 

MDSFETLFPESWLPRKPLASAEKSGAYRHVTRQRALELPYIEANPLVMQSLV 
ITDRDASDADWAADLAGLPSPSYVSMNRVTTTGHIVYALKNPVCLTDAARR 
RPINLLARVEQGLCDVLGGDASYGHRITKNPLSTAHATLWGPADALYELRA 
LAHTLDEIHALPEAGNPRRNVTRSTVGRhTVTLFDTTRMWAYRAVRHSWGG 
PVAEWEHTVFEHIHLLNETIIAD 

The above 227 amino acid sequence (ORF1) was aligned and compared with 
several other protein sequences (target NBRF-PIR, release PIR R52.0 March 1997, 
cut-off 45, KTUP: 2). 

With a protein from Mycobacterium fortuitum plasmid pAL5000 (JS0052) a 
match of 37.1% was found over 194 amino acids (INIT 167, 292). With a protein 
from Corynebacterium glutamicum (S32701) a match of 32.0% over 125 amino acids 
was found (INIT 125, 116). A match of 29.9% over 221 amino acids (INIT 86, 259) 
was found with the ColE2 protein from E. coli (S04455). Precisely the same match 
over the same number of amino acids was found for the ColE3 protein, also from 
E. coli (S04456). 

EXAMPLE 8 

Database sequence homology analysis for predicted polypeptide encoded by the 
second reading frame (SEQ. ID. No. 3) 

MTTRERLPRN GYSL^AAAKK LGVSESTVKR WTSEPREEFV ARVAARHARI 
RELRSEGQSM RAL\AEVGVS VGTVHYALNKNRTDA 






The second protein (ORF2) was also aligned and compared with another protein, 
using the same parameters and software as described for Example 7. This sequence 
however is only 85 amino acids in length. 

The ORF2 sequence was compared with a protein from Mycobacterium 
fortuitum (S32702) and a 53.3% match over 75 amino acids was found (INIT 207, 
207). 

EXAMPLE 9 

Functional analysis of plasmid p545 

In order to improve the vector system further, the plasmid functions essential for 
replication and stability were delineated more precisely through deletion of large 
regions of the original plasmid p545. The result, a smaller cloning vector, will allow 
the use of the p545 based vector system for cloning larger DNA fragments. 

To this end vector pBRESP36A (Figure 2) was digested with SstII and BclI, 
resulting in a 1.7 kb and a 6.5 kb fragment. The 1.7 kb fragment, in fact the 1.6 kb 
AlwNI - BclI fragment of plasmid p545, was replaced by a synthetic duplex DNA, 
composed of SEQ. ID. No. 12 and SEQ. ID. No. 13, with SstII and BclI compatible 
ends and a number of unique restriction enzyme recognition sites. 

SEQ. ID. No. 12 5' GGAGATCTAGATCGATATCTCGAG 3' 

SEQ. ID. No. 13 5' GATCCTCGAGATATCGATCTAGATCTCCGC 3' 

The following restriction enzyme recognition sites were supplied in this way: 
SstII (restored), BglII, XbaI, ClaI, EcoRV, XhoI (BclI is not restored). The 
ligation mixture was transferred to E. coli, and transformants were selected 
containing a vector of the expected composition. The vector was named pBRESAAS- 
B. Subsequent successful transformation of P. freudenreichii strain ATCC6207 with 
this vector indicated that the 1.6 kb region between AlwNI and BclI in p545 is not 
essential for replication of the plasmid. 

A further deletion was made by removal of the 240 bp corresponding to the 
region between SalI and BclI in plasmid p545. This was achieved by digestion of 
pBRESAAS-B with SalI - SstI, and SstI - XhoI respectively, and isolation of the 1.3 
kb SalI - SstI fragment, and the 6.6 kb SstI - XhoI fragment. The fragments were 
ligated, and the ligation mixture was transferred to P. freudenreichii ATCC6207, 
yielding numerous transformants. The newly derived construct, named 
pBRESAAS-S, was isolated and its structure confirmed by restriction enzyme 
mapping. 

Thus all essential information for replication of plasmid p545 is located on a 
fragment of 1.7 kb delimited by the restriction sites SalI and AlwNI and 
encompassing the predicted replication proteins encoded by ORF1 and ORF2, and 
1.8 kb can be deleted without obviously disturbing replication or stability of the 
plasmid. 

EXAMPLE 10 

Expression of a chloramphenicol resistance gene from Corynebacterium 

A chloramphenicol resistance gene (cml) from Corynebacterium has been 
identified as encoding a chloramphenicol export protein. This gene was inserted in 
the Propionibacterium - E. coli shuttle vector pBRESP36B2. This vector was 
digested with BglII and HindIII, and with BglII and HpaI respectively. The 2.9 kb 
BglII - HindIII and 5.2 kb BglII - HpaI fragments were isolated. 

The fragment containing the cml gene, including its own promoter, was obtained 
by digestion with PvuII and HindIII, and the 3.3 kb fragment containing the gene 
was isolated. The two vector-specific fragments and the cml fragment were ligated: 
PvuII and HpaI ends are blunt, thus inserting the cml gene as well as restoring the 
ermE gene of the parent vector. The ligation mixture was transferred to E. coli, and a 
transformant was selected, in which the vector contained the correct cml insert. The 
vector was named pBRESBCM. 

Transformation of P. freudenreichii ATCC6207 with this vector, and selection 
on plates containing 10 µg/ml erythromycin, or 5 µg/ml chloramphenicol, yielded 
erythromycin and chloramphenicol resistant colonies, respectively, indicating that 
apart from the erythromycin resistance gene (shown earlier with the 
Propionibacterium - E. coli shuttle vectors), also the chloramphenicol resistance 
gene is expressed in Propionibacterium. Transformants could be cultivated in liquid 
medium containing up to 20 µg/ml chloramphenicol. 

EXAMPLE 11 

Expression of lipase (gehA) gene from P. acnes 

To illustrate efficient cloning and expression of an extracellular protein using 
the present vector system, a lipase gene gehA from P. acnes was used. Vector 
pUL6001, harbouring gehA on a XhoI fragment, was digested with XhoI, and the 
2.75 kb fragment containing the gene was isolated. Vector pBRESAAS-B (from 
Example 9) was linearized by XhoI, and the ends dephosphorylated using Calf 
Intestine Phosphatase to avoid self ligation. Linearized vector and the gehA 
containing fragment were ligated and the ligation mixture was transferred to E. coli. 
Transformants were analysed by restriction enzyme analysis for the presence of the 
correct recombinant plasmid, named pBRESALIP. This plasmid was subsequently 
transferred to P. freudenreichii strain ATCC6207. Transformants were screened for 
the expression of the lipase gene, using agar plates containing tributyrin as the 
indicator for increased lipase expression. P. freudenreichii transformants harbouring 
pBRESALIP showed significantly increased halo sizes in this assay as compared to 
untransformed strains or strains transformed with the parent vector. 

SUMMARY OF SEQUENCES 

1. DNA sequence of plasmid LMG 16545 (CBS 101022), 3.6 kb. 

2. Amino acid sequence of protein of ORF1 (303 residues, bases 273-1184). 

3. Amino acid sequence of protein of ORF2 (85 residues, bases 1181-1438). 

4-13. DNA primers/oligonucleotides 
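
The residue counts quoted in the summary follow from the base coordinates by simple arithmetic. The sketch below assumes that each coordinate range is inclusive and ends with a stop codon, an assumption not spelled out in the summary; the function names are illustrative.

```python
def orf_residues(first_base: int, last_base: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF given inclusive base coordinates."""
    length_nt = last_base - first_base + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    return codons - 1 if includes_stop else codons


def overlap_nt(orf_a: tuple[int, int], orf_b: tuple[int, int]) -> int:
    """Number of bases shared by two inclusive coordinate ranges."""
    return max(0, min(orf_a[1], orf_b[1]) - max(orf_a[0], orf_b[0]) + 1)


if __name__ == "__main__":
    print(orf_residues(273, 1184))              # ORF1
    print(orf_residues(1181, 1438))             # ORF2
    print(overlap_nt((273, 1184), (1181, 1438)))
```

Under these assumptions the coordinates give 303 and 85 residues, matching the summary, and the two reading frames overlap by four bases.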

CLAIMS 



1. A polynucleotide comprising a sequence capable of hybridising 
selectively to 

(a) SEQ ID NO: 1 or the complement thereof; 

(b) a sequence from the 3.6 kb plasmid of Propionibacterium freudenreichii 
CBS 101022; 

(c) a sequence from the 3.6 kb plasmid of Propionibacterium freudenreichii 
CBS 101023; or 

(d) a sequence that encodes a polypeptide which comprises a SEQ. ID. No. 2 
or 3, an amino acid sequence substantially homologous thereto or a fragment 
of either sequence. 

2. A polynucleotide which is an autonomously replicating plasmid that can 
remain extrachromosomal inside a host cell, which plasmid is derived from an 
endogenous Propionibacterium plasmid, and when comprising a heterologous gene 
is capable of expressing that gene inside the host cell. 

3. A polynucleotide according to claim 1 which is autonomously replicating 
in a host cell. 

4. A polynucleotide according to claim 3 in which the host cell is a 
Propionibacterium. 

5. A polynucleotide according to claim 4 in which the Propionibacterium is 
Propionibacterium freudenreichii. 

6. A polynucleotide according to any one of the preceding claims which is 
capable of selectively hybridising to one or more sequence(s) in SEQ ID No: 1 
which is (or are) necessary for autonomous replication in a Propionibacterium. 

7. A polynucleotide according to claim 1 which comprises either the 1.7 kb 
fragment of SEQ. ID. No. 1 delineated by restriction sites SalI and AlwNI or 
nucleotides 1 to 1750 of SEQ. ID. No. 1. 

8. A vector which comprises a polynucleotide according to any one of the 
preceding claims. 

9. A vector according to claim 8 which is a plasmid. 

10. A vector according to claim 8 or 9 which additionally comprises a 
selectable marker. 

11. A vector according to any one of claims 8 to 10 which is autonomously 
replicating in E. coli. 

12. A vector according to any one of claims 8 to 11 which is an expression 
vector. 

13. A vector according to claim 12 which comprises an endogenous gene of a 
Propionibacterium or a heterologous gene operatively linked to a control sequence 
which is capable of providing for expression of the gene. 

14. A vector according to claim 13 in which the gene is the cobA gene. 

15. A vector according to claim 13 in which the heterologous gene encodes a 
polypeptide which is therapeutic in a human or animal. 

16. A polypeptide which comprises the sequence SEQ ID No: 2 or 3 or a 
sequence substantially homologous thereto, or a fragment of either said sequence, 
or is encoded by a polynucleotide as defined in any of claims 1 to 7. 

17. A host cell comprising a heterogeneous polynucleotide or vector 
according to any one of claims 1 to 15 or which has been transformed or 
transfected with a vector according to any one of claims 13 to 15. 

18. A host cell according to claim 17 which is a bacterium. 

19. A host cell according to claim 18 which is a Propionibacterium or 
E. coli. 

20. A process for producing a host cell according to any one of claims 17 to 
19 comprising transforming or transfecting a host cell with a polynucleotide or 
vector according to any one of claims 1 to 15. 

21. A process for the preparation of a polypeptide, or other compound, the 
process comprising cultivating or fermenting a host cell as defined in any one of 
claims 17 to 19 under conditions that allow expression or production of the 
polypeptide or compound. 

22. A process according to claim 21 which is a fermentation process wherein 
the host cell is cultured in aerobic or anaerobic conditions. 

23. A process according to claim 21 or 22 in which the expressed 
polypeptide or produced compound is recovered from the host cell. 

24. A process according to claim 23 wherein the polypeptide is a protease, 
amylase, lipase or peptidase or the compound is vitamin B12. 

25. A process according to any one of claims 21 to 24 where the polypeptide 
is secreted from the host cell. 

26. A process according to claim 25 in which the polypeptide is expressed on 
the surface of the host cell and/or the polypeptide is an antigen or immunogen. 

27. A polypeptide or compound prepared by a process according to any one 
of claims 20 to 26. 

28. A process for the production of vitamin B12 (cobalamin), the process 
comprising culturing a host cell according to any one of claims 17 to 19 under 
conditions in which the vitamin is produced and, if necessary, isolating the 
vitamin. 

29. Vitamin B12 produced by a process according to claim 28. 

30. A polypeptide according to claim 27 for use in a method of treating the 
human or animal body by therapy. 

31. A host cell according to any one of claims 17 to 19 for use in a method 
of treating the human or animal body by therapy or for use in an animal feed. 

32. Use of a host cell according to any one of claims 17 to 19 or a 
polypeptide or compound according to claim 27 to either make cheese or for use in 
cheesemaking. 

33. Use of a host cell according to any one of claims 17 to 19 or a 
polypeptide or compound according to claim 27, in the manufacture of a foodstuff 
or in an animal feed. 

34. A foodstuff comprising a polypeptide or compound according to claim 27 
or a host cell according to any of claims 17 to 19. 

35. A foodstuff according to claim 34 for consumption by humans (e.g. a 
cheese, sausage) or by an animal. 

36. A process for manufacturing cheese or other fermented dairy product the 
process comprising using a host cell according to any of claims 17 to 19. 

37. A process according to claim 36 wherein the host cell is used instead of 
or in addition to lactic acid bacteria. 

38. A process according to claim 36 or 37 wherein the host cell is a 
Propionibacterium cell. 

Figure 1 

Figure 2b 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Gist-brocades B.V. 

(B) STREET: Wateringseweg 1 

(C) CITY: Delft 

(E) COUNTRY: The Netherlands 

(F) POSTAL CODE (ZIP): 2611 XT 

(ii) TITLE OF INVENTION: Propionibacterium Vector 
(iii) NUMBER OF SEQUENCES: 13 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3555 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Propionibacterium freudenreichii 

(C) INDIVIDUAL ISOLATE: CBS101022 LMG16545 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 273..1184 

(D) OTHER INFORMATION: /gene= "ORF1" 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1181..1438 

(D) OTHER INFORMATION: /gene= "ORF2" 
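
The two CDS locations above can be cross-checked against the protein lengths stated for SEQ ID NO: 2 and SEQ ID NO: 3. The following Python sketch is only an illustration: it assumes that the coordinates are 1-based and inclusive and that each range ends with a stop codon, and the helper name `cds_protein_length` is hypothetical, not part of the listing.

```python
def cds_protein_length(start: int, end: int) -> int:
    """Number of amino acids encoded by a CDS.

    Assumes 1-based inclusive coordinates and that the final codon
    of the range is a stop codon (an assumption, not stated in the listing).
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length_nt // 3 - 1  # subtract the terminal stop codon


# Coordinates taken from the two CDS features of SEQ ID NO: 1.
orf1_length = cds_protein_length(273, 1184)
orf2_length = cds_protein_length(1181, 1438)

# The two ranges share positions 1181..1184, so the reading frames overlap.
overlap_nt = 1184 - 1181 + 1

print(orf1_length, orf2_length, overlap_nt)
```

Under these assumptions the computed lengths agree with the 303 and 85 amino acids given for SEQ ID NO: 2 and SEQ ID NO: 3.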



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GTCGACCCTG ACAGCCGGCG AGCAGTTCAG GCGAAGATCG CACAGCTGCG CGAGGAACTA 
60 

GCCGCAATGC CGGAACACGC CCCAGCCATC CCTTGGAGCA GGTGGCAGCG TCAGGGGAGT 
120 

CGGGGGATGT TTGGCAGGGG ATGTGGAAAG AGAGTTCGCT TTGCTCACAT GGCTCAACCG 
180 

GGTAACTAAC TGATATGGGG TCTTCGTCGC CCACTTTGAA CACGCCGAGG AATGGACCAC 
240 

GCTGAACGTG ACTCGCATGC TTCACTGCAT GT ATG GAT TCG TTC GAG ACG TTG 
293 

Met Asp Ser Phe Glu Thr Leu 
1 5 

TTC CCT GAG AGC TGG CTG CCA GGC AAG GCG CTG GCG TCA GCC GAG AAG 
341 

Phe Pro Glu Ser Trp Leu Pro Arg Lys Pro Leu Ala Ser Ala Glu Lys 
10 15 20 

TCT GGG GCG TAG CGG GAG GTG ACT CGG GAG AGG GCG GTG GAG CTG CCT 
389 

Ser Gly Ala Tyr Arg His Val Thr Arg Gln Arg Ala Leu Glu Leu Pro 
25 30 35 

TAG ATC GAA GCG AAC CCG TTG GTG ATG GAG TCC TTG GTG ATC ACC GAT 
437 

Tyr Ile Glu Ala Asn Pro Leu Val Met Gln Ser Leu Val Ile Thr Asp 
40 45 50 55 

CGA GAT GCT TCG GAT GCT GAG TGG GCC GCA GAG CTC GCT GGG CTG CCT 
485 

Arg Asp Ala Ser Asp Ala Asp Trp Ala Ala Asp Leu Ala Gly Leu Pro 
60 65 70 

TCA CCG TCC TAG GTG TCC ATG AAC GGT GTC ACG AGC ACC GGA CAC ATC 
533 

Ser Pro Ser Tyr Val Ser Met Asn Arg Val Thr Thr Thr Gly His Ile 
75 80 85 

... ... ... ... ... ... CCT GTG TGT CTG ACC GAT GCC GCG CGG CGA 
581 

Val Tyr Ala Leu Lys Asn Pro Val Cys Leu Thr Asp Ala Ala Arg Arg 
90 95 100 

CGG CCT ATC AAC CTG CTC GCC GGC GTC GAG GAG GGG GTA TGG GAG GTT 
629 

Arg Pro Ile Asn Leu Leu Ala Arg Val Glu Gln Gly Leu Cys Asp Val 
105 110 115 

CTC GGC GGC GAT GCA TCC TAG GGG CAC GGG ATC ACA AAG AAC CCG CTC 
677 

Leu Gly Gly Asp Ala Ser Tyr Gly His Arg Ile Thr Lys Asn Pro Leu 
120 125 130 135 

AGC ACC GCC CAT GCG ACC CTG TGG GGC CCC GCA GAC GCG CTC TAG GAG 
725 

Ser Thr Ala His Ala Thr Leu Trp Gly Pro Ala Asp Ala Leu Tyr Glu 
140 145 150 

CTG CGC GCC CTC GCA CAC ACC CTC GAC GAG ATC CAC GCA CTG CCG GAG 
773 

Leu Arg Ala Leu Ala His Thr Leu Asp Glu Ile His Ala Leu Pro Glu 
155 160 165 

GCA GGG AAC CCG CGT CGC AAC GTC ACC CGA TCA ACG GTC GGC CGC AAC 
821 

Ala Gly Asn Pro Arg Arg Asn Val Thr Arg Ser Thr Val Gly Arg Asn 
170 175 180 

GTC ACC CTG TTC GAC ACC ACC CGC ATG TGG GCA TAC CGG GCC GTC CGG 
869 

Val Thr Leu Phe Asp Thr Thr Arg Met Trp Ala Tyr Arg Ala Val Arg 
185 190 195 

CAC TCC TGG GGC GGC CCG GTC GCC GAA TGG GAG CAC ACC GTA TTC GAG 
917 

His Ser Trp Gly Gly Pro Val Ala Glu Trp Glu His Thr Val Phe Glu 
200 205 210 215 

CAC ATC CAC CTA CTG AAC GAG ACG ATC ATC GCC GAC GAA TTC GCC ACA 
965 

His Ile His Leu Leu Asn Glu Thr Ile Ile Ala Asp Glu Phe Ala Thr 
220 225 230 

GGC CCC CTC GGC TTG AAC GAA CTT AAG CAC TTA TCT CGA TCC ATT TCC 
1013 

Gly Pro Leu Gly Leu Asn Glu Leu Lys His Leu Ser Arg Ser Ile Ser 
235 240 245 

CGA TGG GTC TGG CGC AAC TTC ACC CCC GAA ACC TTC CGC GCA CGC CAG 
1061 

Arg Trp Val Trp Arg Asn Phe Thr Pro Glu Thr Phe Arg Ala Arg Gln 
250 255 260 

AAA GCG ATC AGC CTC CGT GGA GCA TCC AAA GGC GGC AAA GAA GGC GGC 
1109 

Lys Ala Ile Ser Leu Arg Gly Ala Ser Lys Gly Gly Lys Glu Gly Gly 
265 270 275 

CAC AAA GGC GGC ATT GCC AGT GGC GCA TCA CGG CGC GCC CAT ACC CGT 
1157 

His Lys Gly Gly Ile Ala Ser Gly Ala Ser Arg Arg Ala His Thr Arg 
280 285 290 295 

CAA CAG TTC TTG GAG GGT CTC TCA TGACCACACG TGAACGTCTC CCCCGCAACG 
1211 

Gln Gln Phe Leu Glu Gly Leu Ser 
300 

GCTACAGCAT CGCCGCTGCT GCGAAAAAGC TCGGTGTCTC CGAGTCCACC GTCAAGCGGT 
1271 

GGACTTCCGA GCCACGCGAG GAGTTCGTGG CCCGCGTTGC CGCACGCCAC GCGCGGATTC 
1331 

GTGAGCTCCG CTCGGAGGGT CAGAGCATGC GTGCGATTGC TGCCGAGGTC GGGGTTTCCG 
1391 

TGGGCACCGT GCACTACGCG CTGAACAAGA ATCGAACTGA CGCATGACCG TAACGCCGCA 
1451 

CGATGAGCAT TTTCTTGATC GTGCACCGCT TGGCACTACG TTCGCGTGCG GTTGCACAGT 
1511 

GCGCGCCACG TTCTTATCCT GCGGCCATTG TGGCTACAGC CAATGGGGGG CATCAGCAAC 
1571 

GGACGTTGAA CCCGGTGGGC AAGTGTTACT CAGGGGGACA TGCCCAGTCT GCGGCGCTCG 
1631 

GATTGACGGT ATGGCAGTCG TGCATGCGGC CCCACCGTCA AACTCATTCA GGTATCAGTG 
1691 

AGAACCCTCA TGGCACCCCC TCGTGACACG TTCTCGTTGC GATCAGCTGC TGTGCX3TGCG 
1751 

GGCGTGAGCG TTTCTACGCT GCGGCGCAGG AAATCAGAGC TTGAGGCTGC CGGAGCGACG 
1811 

GTAGACCCGT CCGGTTGGGT GGTGCCACTG CGTGCACTCA AGGTCGTTTT TGGGGTGTCA 
1871 

GATGAGACCT CGAATGCGCC CGGTCATGAC GCTGAGTTAG TGGCGCAGCT GCGCTCTOAG 
1931 

AACGAGTTTT TACGGCGTCA GGTCGAGCAG CAGGCGCGCA CGATCGAACG GCAGGCTGAG 
1991 

GCACACGCGG TGGTCTCAGC GCAGCTCACA CGGGTTGGCC AGCTTGAGGC CGGCGACGCA 
2051 

GCAGCACCGA CACTGGCACC CGTTGAAAGG CCGGCTCCGC GACGGCGGTG GTGGCAGCGT 
2111 

CGGTAGCGGT CAGGATCGCT CTGGCGTGAC GAGTGTGTCT GGCAGTGCGA ACAGTTGCTC 
2171 

GACCAGTGGC AGCAGAAGCG AGATCGCTGC GTGGTGCTGT TCCTCGGTCA GTTCGTCGAG 
2231 

GACTGGCGGG TCTTGCTGCG TCCAGCCGAT CGCCTCGGCG GCCAAGGTCA GTTCCAAGCT 
2291 

GTGCCAACGC ACACGCCCCT CX5GCTGACAG CTGAGTCTCG AACTGTGCAA CTGGACCGGC 
2351 

CGGAAGATGC ACGTTGCCGA GGTCGTGAGT GGCCAAGCGC ACGTCAAAGA GTGCTGCTTC 
2411 

GTAGCCX5CGC AGAAATGGCA GTGCTCGGTC GATTCGGATC GGCCTGCCCA GGTACATTCC 
2471 

GGGCCGCTTG ATGAACX3CCT CCGCGTAGAA GCGCACCGTT CTCGGCCCX3G CCTCGTGATC 
2531 

TGTCACTGTG CACGCTCCTC TCGATGGTTC TCGACGCTAC CGGAGACCAC CGACGTTCAT 
2591 

GCCCAGCGCA GCGACCTGAA AGGACCAAGC CGAGTTAGCC GTGCTAACCG TATAGCTTGC 
2651 

TCCGTCGCCT CTGAGGGCAA CCACCTGCGC AGCAGGTGGG CGGCAGCCCG CGCGCAAGCG 
2711 

CCTACCGGGT TTGGGCACAG CCCATAAATC AACGCCTCCG GTGTTGAAGC GATCGTGTGT 
2771 

CACGATTGCT ATGCTTGCTA CCCCTTCAGG GTTTTCGTAT ACACAAATCA AGTTTTTTCG 
2831 

TATACGCTAA TGCCATGAGT GAGCATCTAC TGCACGGCAA GCCCGTCACC AACGAGCAGA 
2891 

TTCAGGCATG GGCAGACGAG GCCGAGGCCG GATACGACCT GCCCAAACTC CCCAAGCCAC 
2951 

GGCGCGGACG CCCGCCCGTA GGAGACGGTC CGGGCACCGT CGTACCCGTG CGTCTCGACG 
3011 

CGGGCACCGT TGCCGCTCTC ACAGAACGAG CAACAGCCGA GGGCATCACG AACCGTTCAG 
3071 

ACGCGATCCG AGCCGCAGTC CACGAGTOGA CACGGGTTGC CTGACCTCCA CGACTCAGCA 
3131 

CGCAAGCACT ACCAACGAGA CCGGCTCGAC GACACGGCCG TGCTCTACGC GGCCACCCAC 
3191 

GTTCTCAACT CCCGGCCACT CGACGACGAA GACGACCCGC GCCGCTGGCT CATGATCGGA 
3251 

ACCGACCCAG CAGGCCGCCT ACTCGAACTC GTCGCACTGA TCTACGACGA CGGCTACGAA 
3311 

CTGATCATCC ACGCAATGAA AGCCCGCACC CAATACCTCG ACCAGCTCTA ACCAAGAAAG 
3371 

GAACCTGATG AGCGACCAGC TAGACAGCGA CCGCAACTAC GACCCGATGA TCTTCGACGT 
3431 

GATGCGCGAG ACCGCGAACC GCGTCGTCGC CACGTACGTT GCATGGGAAG ATGAAGCCGC 
3491 

TGATCCCCGC GAGGCTGCGC ACTGGCAGGC CGAGCGATTC CGCACCCGGC ACGAGGTGCG 
3551 

CGCC 
3555 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 303 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asp Ser Phe Glu Thr Leu Phe Pro Glu Ser Trp Leu Pro Arg Lys 
1 5 10 15 

Pro Leu Ala Ser Ala Glu Lys Ser Gly Ala Tyr Arg His Val Thr Arg 
20 25 30 

Gln Arg Ala Leu Glu Leu Pro Tyr Ile Glu Ala Asn Pro Leu Val Met 
35 40 45 

Gln Ser Leu Val Ile Thr Asp Arg Asp Ala Ser Asp Ala Asp Trp Ala 
50 55 60 

Ala Asp Leu Ala Gly Leu Pro Ser Pro Ser Tyr Val Ser Met Asn Arg 
65 70 75 80 

Val Thr Thr Thr Gly His Ile Val Tyr Ala Leu Lys Asn Pro Val Cys 
85 90 95 

Leu Thr Asp Ala Ala Arg Arg Arg Pro Ile Asn Leu Leu Ala Arg Val 
100 105 110 

Glu Gln Gly Leu Cys Asp Val Leu Gly Gly Asp Ala Ser Tyr Gly His 
115 120 125 

Arg Ile Thr Lys Asn Pro Leu Ser Thr Ala His Ala Thr Leu Trp Gly 
130 135 140 

Pro Ala Asp Ala Leu Tyr Glu Leu Arg Ala Leu Ala His Thr Leu Asp 
145 150 155 160 

Glu Ile His Ala Leu Pro Glu Ala Gly Asn Pro Arg Arg Asn Val Thr 
165 170 175 

Arg Ser Thr Val Gly Arg Asn Val Thr Leu Phe Asp Thr Thr Arg Met 
180 185 190 

Trp Ala Tyr Arg Ala Val Arg His Ser Trp Gly Gly Pro Val Ala Glu 
195 200 205 

Trp Glu His Thr Val Phe Glu His Ile His Leu Leu Asn Glu Thr Ile 
210 215 220 

Ile Ala Asp Glu Phe Ala Thr Gly Pro Leu Gly Leu Asn Glu Leu Lys 
225 230 235 240 

His Leu Ser Arg Ser Ile Ser Arg Trp Val Trp Arg Asn Phe Thr Pro 
245 250 255 

Glu Thr Phe Arg Ala Arg Gln Lys Ala Ile Ser Leu Arg Gly Ala Ser 
260 265 270 

Lys Gly Gly Lys Glu Gly Gly His Lys Gly Gly Ile Ala Ser Gly Ala 
275 280 285 

Ser Arg Arg Ala His Thr Arg Gln Gln Phe Leu Glu Gly Leu Ser 
290 295 300 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 85 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Thr Thr Arg Glu Arg Leu Pro Arg Asn Gly Tyr Ser Ile Ala Ala 
1 5 10 15 

Ala Ala Lys Lys Leu Gly Val Ser Glu Ser Thr Val Lys Arg Trp Thr 
20 25 30 

Ser Glu Pro Arg Glu Glu Phe Val Ala Arg Val Ala Ala Arg His Ala 
35 40 45 

Arg Ile Arg Glu Leu Arg Ser Glu Gly Gln Ser Met Arg Ala Ile Ala 
50 55 60 

Ala Glu Val Gly Val Ser Val Gly Thr Val His Tyr Ala Leu Asn Lys 
65 70 75 80 

Asn Arg Thr Asp Ala 
85 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

AATTCAAGCT TGTCGACGTT AACCTGCAGG CATGCGGATC CGGTACCGAT ATCAGATCT 
59 

(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CCGAAGATCT GATATCGGTA CCGGATCCGC ATGCCTGCAG GTTAACGTCG ACAAGCTTG 
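
SEQ ID NO: 4 and SEQ ID NO: 5 read as largely complementary single strands. The following Python sketch checks that relationship. It is an illustration only: the idea that the two oligonucleotides are meant to anneal into a duplex with 4-nucleotide 5' overhangs is an inference from the sequences, not something stated in this listing, and the helper names are hypothetical.

```python
# Sequences copied from SEQ ID NO: 4 and SEQ ID NO: 5 with the spacing removed.
SEQ4 = "AATTCAAGCTTGTCGACGTTAACCTGCAGGCATGCGGATCCGGTACCGATATCAGATCT"
SEQ5 = "CCGAAGATCTGATATCGGTACCGGATCCGCATGCCTGCAGGTTAACGTCGACAAGCTTG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_with_5prime_overhangs(top: str, bottom: str, overhang: int) -> bool:
    """True if the strands pair everywhere except `overhang` bases at each 5' end.

    Assumes equal-length strands and the same overhang length on both ends.
    """
    rc_bottom = reverse_complement(bottom)
    return top[overhang:] == rc_bottom[: len(rc_bottom) - overhang]


pairs = anneals_with_5prime_overhangs(SEQ4, SEQ5, overhang=4)
top_overhang = SEQ4[:4]      # unpaired 5' end of SEQ ID NO: 4
bottom_overhang = SEQ5[:4]   # unpaired 5' end of SEQ ID NO: 5

print(pairs, top_overhang, bottom_overhang)
```

Under these assumptions the two strands share a 55-nucleotide paired core, leaving the four 5'-terminal nucleotides of each strand unpaired.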



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GTACCGGCCG CTGCGGCCAA GCTT 
24 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GATCAAGCTT GGCCGCAGCG GCCG 
24 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

AAACTGCAGC TGCTGGCTTG CGCCCGATGC TAGTC 
35 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

AAACTGCAGC AGCTGGGCAG GCCGCTGGAC GGCCTGCCCT CGAGCTCGTC TAGAATGTGC 
60 

TGCCGATCCT GGTTGC 
76 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTAGTCTAGA CACCQATOAG (3AAACCC(SAT GA 
32 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCAAGCTTC TCGAGTCAGT GGTCGCTGGG CGCGCG 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GGAGATCTAGATCGATATCTCGAG 
24 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (synthetic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GATCCTCGAGATATCGATCTAGATCTCCGC 
30 



